Recombination in large RNA viruses: Coronaviruses

Michael M. C. Lai



Coronaviruses contain a very large RNA genome, which
undergoes recombination at a very high frequency of nearly
25% for the entire genome. Recombination has been
demonstrated to occur between viral genomes and between
defective-interfering (DI) RNAs and viral RNA. It provides
an evolutionary tool for both viral RNAs and DI RNAs and
may account for the diversity in the genomic structure of
coronaviruses. The capacity of coronaviruses to undergo
recombination may be related to their mRNA transcription
mechanism, which involves discontinuous RNA synthesis,
suggesting the nonprocessive nature of the viral polymerase.
Recombination is also used as a tool for the mutagenesis of
viral genomic RNA.
Coronaviruses contain an extraordinarily large RNA
genome (27-31 kb). This large RNA size imposes a
severe burden on the virus because such an RNA can
be expected to accumulate a large number of errors
during RNA replication, assuming that the error
frequency of coronaviral RNA polymerase is comparable
to that of other RNA viruses. Thus, coronaviruses
must develop genetic mechanisms to counter the
potentially deleterious effects of the errors. RNA
recombination is one such mechanism.

The discovery of RNA recombination in coronaviruses 1
was made at a time when only picornaviruses, but no
other RNA viruses, had been demonstrated to be capable
of RNA recombination. And it came with a vengeance, as
murine coronaviruses were quickly shown to recombine at
very high frequency under a variety of natural and
experimental conditions. The capacity to recombine has
now been demonstrated in several different coronaviruses.
Recombination is an important mechanism contributing to
both the genetic stability and diversity of
coronaviruses in nature.


Characteristics of coronavirus RNA and its 
synthesis 

The coronavirus RNA genome is a single species of
single-stranded, positive-sense RNA 27-31 kb in
length (see Lai 1990). 2 It consists of seven to ten genes,
one of which (gene 1) encodes a precursor of RNA
polymerase of approximately 750-800 kDa. Gene
composition and arrangement vary among the different
coronaviruses (Figure 1). The enormous size
(22 kb) of the polymerase gene suggests that the
polymerase has multiple functions. Each gene is
expressed through one of the mRNAs, which are
3'-coterminal and have a nested-set structure. Only
the 5'-most gene of each mRNA is functional for
protein translation. Each mRNA has a leader
sequence of 70-90 nucleotides derived from the
5'-end of the genome RNA. mRNA transcription is
carried out by a discontinuous transcription mechanism,
which fuses the leader RNA to the transcription
start signal (intergenic sequence). The mRNA leader
sequence is usually derived in trans from a different
RNA molecule. 3,4 Therefore, the coronaviral polymerase
must jump between the leader sequence and
intergenic sequences in different RNA molecules
during positive- or negative-strand RNA synthesis.
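
The nested-set arrangement described above can be made concrete with a
small sketch. The Python below is illustrative only: the genome length,
the leader length, and the positions of the transcription start signals
are hypothetical round numbers chosen within the ranges quoted in the
text, not measured coordinates for any coronavirus.

```python
# Illustrative model of 3'-coterminal, nested-set subgenomic mRNAs.
# All coordinates are hypothetical placeholders, not real MHV positions.
GENOME_LENGTH = 31_000  # nt, within the 27-31 kb range quoted in the text
LEADER_LENGTH = 70      # nt, within the 70-90 nt range quoted in the text

# Hypothetical positions of the transcription start signals
# (intergenic sequences) used for subgenomic mRNAs 2-7.
START_SIGNALS = {2: 21_700, 3: 23_900, 4: 27_900,
                 5: 28_300, 6: 29_000, 7: 29_700}


def mrna_segments(start_signal):
    """Return the genome intervals (start, end) fused into one mRNA.

    The leader comes from the 5'-end of the genome; the body runs from
    the start signal to the 3'-end, so every mRNA is 3'-coterminal.
    """
    return [(0, LEADER_LENGTH), (start_signal, GENOME_LENGTH)]


def mrna_length(start_signal):
    """Total length in nucleotides of the mRNA made from a start signal."""
    return sum(end - start for start, end in mrna_segments(start_signal))


for number, start in START_SIGNALS.items():
    print(f"mRNA {number}: leader + body from nt {start}, "
          f"{mrna_length(start)} nt long")
```

Each mRNA in this sketch is a discontinuous join of two genome
intervals, which is the property that requires the polymerase to move
between the leader and an intergenic sequence.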


Recombination between viral genomes 

The first coronavirus recombinant was isolated by
coinfecting temperature-sensitive (ts) mutants of two
mouse hepatitis virus (MHV) strains, A59 and JHM,
and selecting progeny viruses which grew at the
nonpermissive temperature. 1 The identity of this
recombinant was established by genomic sequence
analysis, which showed that it indeed had one cross-over
site and contained sequences from both parents.
Subsequently, additional recombinants were obtained
using different pairs of ts mutants and other selection
markers, including monoclonal antibody neutralization
epitopes and cell-cell fusion ability. 5-9
Although recombination frequency was not determined
in these early studies, the ease with which these
recombinants were isolated suggested that the
recombination frequency of MHV was very high. This was
also suggested from the finding that many of these
recombinants had multiple cross-overs, some of which
were surprisingly located outside of the two selection
markers used for the isolation of the recombinants.
Therefore, MHV recombination likely occurs at such
a high frequency that recombinants are selected
without specific selection pressure. The high
frequency of recombination was also demonstrated in an
experiment in which an A59 ts mutant and wild-type
JHM were used for a mixed infection. 8 The recombinant
viruses that grew at the nonpermissive temperature
became the predominant virus population
after only two tissue culture passages. This result was
striking because one of the parental viruses (JHM) was
not selected against, suggesting that the recombinants
had evolutionary advantages over the parental viruses
under the experimental conditions. The finding
indicated that recombination could serve as a tool for
virus evolution.

Figure 1. Genome structures of the Coronaviridae family. Comparable genes in the different
coronaviruses are represented by the same fill patterns. Inverted triangles represent the
transcription start signals (intergenic sequences). The mRNAs made from the signals are shown for
MHV only in the lower half of the figure (mRNAs are named 1-7, corresponding to genes 1-7).
Arrows indicate the translation termination sites on each mRNA. The open arrow in mRNA 5
indicates an internally initiated ORF. Gene 1 contains two overlapping open reading frames, which
are translated by a ribosomal frameshifting mechanism. The functions of the genes that are
represented by unfilled boxes are unknown. HE, hemagglutinin-esterase; S, spike protein; E,
envelope protein; M, membrane protein; N, nucleocapsid protein; L, leader sequence. Black
circles indicate the identified 'hotspots' for recombination. IBV and MHV belong to genus
coronavirus, whereas Torovirus is in a separate genus. IBV, avian infectious bronchitis virus; MHV,
mouse hepatitis virus.

Recombination can occur almost everywhere in the
MHV genome. However, some cross-over sites appear
to be restricted in recombination between certain
pairs of viruses. For example, cross-overs in the 3'-end
of the viral genome were rarely detected in the
recombination between A59 and JHM, and yet they
occurred frequently between MHV-2 and A59. 7 As
discussed below, this finding probably reflects the
possibility that recombinants with chimeric viral
proteins derived from certain pairs of parental viruses
might be unstable or have an inferior replication
ability, and, therefore, were selected against during
virus growth. So far, only homologous recombination
has been detected between coronaviruses. This is in
contrast to the frequent occurrence of nonhomologous
or aberrant homologous recombination seen in
other RNA viruses, such as turnip crinkle virus, brome
mosaic virus or Sindbis virus, 10-12 in which cross-overs
occur at nonhomologous sites on the two parental
RNAs despite the presence of homologous sequences
on them. 13 The absence of nonhomologous recombination
in coronaviruses may reflect their rigid viral
RNA or protein structure requirements for optimal
virus growth.

Recombination occurred not only in tissue culture, 
but also in animal infections, as demonstrated by the 
intracerebral inoculation of MHV into mouse brain. 6 
Again, recombinants were isolated at a very high 
frequency, comparable to that in tissue culture. 

By performing a series of recombination studies
between different pairs of ts mutants, Baric et al. were
able to establish a linear recombination map for
MHV. 14 The two most distant ts markers used in that
study had a recombination frequency of 8.7%. By
estimating the genetic locations of the ts defects and
assuming that recombination occurred reciprocally, a
recombination frequency of approximately 25% was
extrapolated for the entire MHV genome (31.2 kb).
This recombination frequency translates to approximately
1% recombination for every 1300 nucleotides,
which is in the same range as the estimated frequency
for picornaviruses (1% for every 1700 nucleotides); 15
however, because of the extremely large size of the
coronavirus RNA, the overall recombination frequency
of MHV appears very high. Subsequent
recombination mapping studies showed that there is
an increasing gradient of recombination frequency
(in the 5' to 3' direction) across the genome. 16,17
This result is best interpreted as reflecting the possible
participation of the subgenomic mRNAs in recombination,
since the subgenomic mRNAs of coronaviruses have a
3'-coterminal, nested-set structure, and thus are
preferentially enriched in the 3'-end sequence (Figure 1).
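
The per-nucleotide figures quoted above follow from simple division,
which the sketch below reproduces. It assumes that recombination
frequency is uniform and additive along the genome, which is a
simplification (the text itself describes a 5' to 3' gradient). The
picornavirus genome length of 7500 nucleotides is an assumed round
figure that does not appear in the text.

```python
def nucleotides_per_percent(genome_length_nt, genome_frequency_percent):
    """Nucleotides spanned by 1% recombination, assuming a uniform,
    additive recombination frequency along the genome."""
    return genome_length_nt / genome_frequency_percent


def genome_wide_percent(genome_length_nt, nt_per_percent):
    """Whole-genome recombination frequency (%) implied by a spacing
    of `nt_per_percent` nucleotides per 1% recombination."""
    return genome_length_nt / nt_per_percent


MHV_LENGTH_NT = 31_200          # from the text
PICORNAVIRUS_LENGTH_NT = 7_500  # assumption, not from the text

mhv_spacing = nucleotides_per_percent(MHV_LENGTH_NT, 25.0)
print(f"MHV: 1% recombination per {mhv_spacing:.0f} nt")

picorna_total = genome_wide_percent(PICORNAVIRUS_LENGTH_NT, 1700)
print(f"Picornavirus: about {picorna_total:.1f}% over the whole genome")
```

The MHV spacing comes out at about 1250 nucleotides, consistent with
the rounded value of approximately 1300 in the text. Under the assumed
genome length, a similar per-nucleotide rate gives a picornavirus a far
lower genome-wide frequency, which is the point the comparison makes.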

Despite the high frequency of recombination in
MHV in tissue culture and experimental inoculations
in animals, there has been no clear-cut evidence for
the occurrence of recombination among natural
MHV strains, probably because they have not been
extensively studied. In contrast, clear-cut evidence of
recombination has been obtained for natural isolates
of avian infectious bronchitis virus (IBV), many of
which show recombination between different strains
in the spike protein gene or the 3'-end of viral
RNA. 18-22 Recombination has now been demonstrated
experimentally for IBV in embryonated eggs 23 and
for transmissible gastroenteritis virus (TGEV) in tissue
culture (L. Enjuanes, personal communication).
However, the recombination frequency
in these two viruses may not be very high, as may be
inferred from the difficulty of isolating these
recombinants.


Recombination between viral RNA and
defective-interfering (DI) RNA; incorporation
of viral sequences into DI RNAs

DI RNA has traditionally been considered a product
of nonhomologous recombination during viral RNA
replication. 13 The generation and structure of
coronavirus DI RNAs are discussed in the next
chapter, and thus are considered here only in the
context of RNA recombination. It has been shown
that once a DI RNA is generated, its size and structure
continue to change as it is passaged in tissue culture.
Thus, the predominant DI RNA species is different at
different passage levels. 24 This phenomenon is at least
partially caused by recombination, as shown by the
evolution of an MHV-JHM DI RNA passaged in
MHV-A59-infected cells. 25 In this instance, a novel DI RNA
species, which was determined to be a recombinant of
A59 and JHM, appeared after a few passages. 25 This
recombination event could have occurred between
the original JHM DI RNA and A59 viral RNA or
between JHM DI RNA and a new DI RNA generated
from the A59 virus. In either case, this finding
demonstrated that viral RNA sequences can be
incorporated into DI RNA by recombination.

Recombination between the viral RNA and DI RNA
could be even more easily demonstrated when the DI
RNA had a weak replication ability. It has long been
known that MHV DI RNAs containing a long, translatable
open reading frame (ORF) usually replicate
better than those with a shorter ORF, although the
ORF itself is not required for DI RNA replication. 26,27
Thus, when a DI RNA with a short ORF was
transfected into virus-infected cells, a new DI RNA
with a longer ORF quickly became the predominant
DI RNA species. 28,29 This was usually the result of
recombination between the DI RNA and the viral
RNA, in which the viral sequences replaced part of the
original DI RNA to extend the ORF in the DI RNA.

Another type of recombination between DI RNA
and the viral RNA involves the transfer of the leader
sequence of the helper virus genomic RNA to the DI
RNA, replacing the original DI RNA leader
sequence. 30,31 This occurs only when a stretch of the
repetitive sequence, which resembles the transcription
start signal, is present immediately downstream
of the leader sequence in the DI RNA. 4,30 The
recombinant DI RNA can become the predominant
species within just one replication cycle. Thus, the
leader junction site may be considered a hot spot of
recombination. This type of recombination is very
similar to coronavirus mRNA transcription, which
uses a discontinuous transcription mechanism involving
a separate leader RNA. The free leader RNA used
for transcription may also be involved in this type of
recombination. Therefore, this type of recombination
probably uses a mechanism similar to mRNA
transcription.


Targeted RNA recombination: incorporation of
DI RNA sequences into viral genomic RNAs

The reciprocal outcome of recombination between
the DI RNA and the viral RNA as described above is
the incorporation of DI RNA sequences into the viral
RNA. This has the desirable consequence of changing
the viral RNA sequence, inasmuch as DI RNAs can be
manipulated by recombinant DNA methodology. The
feasibility of this approach was first demonstrated by
transfecting an mRNA 7 construct (representing the
3'-end sequence of the viral RNA) (Figure 1) into cells
infected with a ts mutant with a defective N gene. 32 As
a result of this transfection, wild-type viruses with a
functional N gene were obtained. Sequence analysis
showed that they were bona fide recombinants, in
which sequences from the transfected RNA replaced
the defective gene in the original virus. Similar
recombination events have also been observed when
RNA fragments representing either the 5'- or 3'-ends
of the viral RNAs were transfected into virus-infected
cells. 33 In this case, the viral RNA containing the
sequence of the transfected RNA fragments was
detected by reverse transcription-polymerase chain
reaction (RT-PCR), although the actual recombinants
could not be isolated because of lack of selection
markers. It is noteworthy that these transfected RNA
fragments could not replicate; 32,33 thus, they probably
directly served as templates for RNA recombination.
In addition, both the transfected positive- and
negative-strand RNAs could lead to recombination, 33
suggesting that recombination may occur during both
positive- and negative-strand RNA synthesis. So far,
this type of recombination has not been demonstrated
in the internal region of the RNA, where it is
likely to occur at a lower efficiency because at least two
cross-over events are required.

More efficient recombination of this type occurred
when DI RNAs that can replicate were used as the
donor sequence. 34-36 These DI RNAs typically contain
sequences of both the 5'- and 3'-ends of the viral RNA
genome, which include the RNA replication
signals. 37,38 Probably as a result of DI RNA replication,
more RNA substrates for recombination were generated
and more RNA replication events occurred,
creating more opportunities for recombination.
Using this approach, recombinant viruses which had
incorporated DI sequences into the viral RNA could
be obtained at a higher efficiency than when
nonreplicating RNAs were used. 34,35,39 Theoretically,
either the 5'- or 3'-end sequences of the DI RNAs
could be incorporated into the viral RNA via this
mechanism; however, so far, only recombination
involving the 3'-end sequence has been achieved. The
lack of 5'-end recombination may simply be due to the
lack of appropriate selection markers. Although the
recombination frequency has not been determined
for these studies, this approach has proven to be very
useful for introducing desirable sequences into the
viral RNA. Utilized in this manner, recombination is a
valuable tool for coronavirus studies because an
infectious coronavirus cDNA or recombinant RNA is
still not available (no doubt due to the large size of
the RNA). This recombination strategy provides an
alternative method for introducing site-specific mutations
into the viral RNA genome. It has generated the first
interspecies recombinant virus between MHV and
bovine coronavirus (BCV). 39
The effects of recombination on virus
evolution

Experimental evidence in tissue culture indicated that
recombination can generate new viruses, even when
no specific selection pressures were applied, as long as
these recombinants have evolutionary advantages. 8
The effects of recombination on the evolution of
coronavirus DI RNAs have also been
demonstrated. 25,28,29 Furthermore, in natural coronavirus
infections, recombination also serves as an evolutionary
tool. This is particularly evident for IBV, many
field isolates of which are recombinants between
various IBV strains. 19-22,40 Two classes of natural IBV
recombinants have been identified so far. In the first,
recombination occurs in the spike protein gene,
conceivably allowing the virus to alter surface
antigenicity, and thus escape immune surveillance in the
animals. In the second class, recombination occurs in
the 3'-end of the viral RNA, which may alter the
replication ability of the RNA because this region
contains regulatory sequences for RNA replication.

Recombination may also explain the many gene
insertion and rearrangement events in the various
coronavirus genomes. When the genome structures of
various coronaviruses are compared, it is apparent
that IBV contains two additional ORFs between the N
and M genes, which are not present in other
coronaviruses (Figure 1). MHV also contains two
novel genes (gene 2 and HE protein gene) between
the polymerase and spike protein genes. These genes
must have been inserted into the coronavirus genomes
by recombination between coronavirus and
cellular or viral RNAs. Since the HE protein of MHV
shares sequence similarity with the influenza C virus
HEF (hemagglutinin-esterase-fusion) protein, 41 the
HE gene was likely the result of recombination
between an ancestral coronavirus and influenza C
virus. Furthermore, since the HE gene is present only
in some coronaviruses, this recombination event was
probably a fairly recent occurrence. When the genome
of coronaviruses (e.g. MHV) is compared to that
of torovirus, which belongs to a different genus of the
Coronaviridae family, it appears that gene 2 of
coronavirus is present in torovirus as part of its gene 1,
and part of the coronavirus HE protein gene is
present elsewhere (in gene 4) in the torovirus RNA
(Figure 1). 42 Since each coronavirus gene is flanked
by a stretch of similar intergenic sequences, which
serves as a transcription start signal (Figure 1), each
viral gene may be regarded as a gene cassette, which
can be easily moved to the various sites on the RNA
genome by recombination between the intergenic
sequences.

In summary, recombination has played an important
role in the past evolution of coronaviruses, and
continues to play significant roles in the ongoing
evolution of viruses in nature.


Mechanism of RNA recombination 

It is supposed that recombination in coronaviruses, as
in other RNA viruses, occurs by a copy-choice
mechanism, 13 although there is still no direct evidence for
this. In this model, recombination takes place during
RNA replication, when RNA polymerase pauses at
certain sites on the RNA template. The nascent RNA
transcripts separate from the original template, and
then join themselves to a different RNA template to
continue RNA synthesis. Depending on the rejoining
sites, the resultant RNA recombination will be either
homologous or nonhomologous. Several pieces of
evidence support this model: RNA transcripts of
discrete sizes have been detected in MHV-infected
cells; 43 these RNAs appear to represent transcripts
which have paused at sites of strong secondary
structures and may participate in recombination.
Since coronaviruses utilize a discontinuous transcription
mechanism to synthesize mRNAs, the viral
polymerase and nascent RNA transcripts must dissociate
from the RNA template regularly during RNA
transcription to fuse the leader RNA to a distant
mRNA start site. Therefore, the coronavirus polymerase
is probably not a processive enzyme and is able to
dissociate from and rejoin itself to RNA templates
with regularity. Indeed, one of the most frequently
utilized MHV recombination sites is at the junction
between the leader RNA and the remainder of the
RNA genome, 5 which is reminiscent of the joining of
the leader and the body sequence during mRNA
transcription. This result suggests that the intermediates
of mRNA transcription may participate in
recombination. This interpretation is also consistent
with the finding that recombination frequency
increases toward the 3'-end of the MHV genome,
suggesting that subgenomic mRNAs also participate
in RNA recombination. 16,17 Thus, the mechanism of
coronavirus RNA recombination may be similar to
that of mRNA transcription.
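
The link between the nested-set mRNAs and the 3'-ward gradient can be
illustrated by counting how many RNA species cover each genome
position. The sketch below is a toy model under stated assumptions:
the start-signal positions are hypothetical, every RNA species is given
the same abundance, and the leader segment is ignored. Whether template
coverage actually determines recombination frequency is the hypothesis
discussed in the text, not something this code demonstrates.

```python
# Hypothetical start-signal positions (nt) for subgenomic mRNAs 2-7;
# placeholders for illustration, not real MHV coordinates.
START_SIGNALS = [21_700, 23_900, 27_900, 28_300, 29_000, 29_700]


def template_copies(position, start_signals=START_SIGNALS,
                    genome_copies=1, copies_per_mrna=1):
    """Count the RNA molecules whose sequence covers `position`.

    Genomic RNA covers every position. A subgenomic mRNA covers a
    position only if it lies at or downstream of that mRNA's start
    signal, because all mRNAs extend to the 3'-end.
    """
    copies = genome_copies
    for start in start_signals:
        if position >= start:
            copies += copies_per_mrna
    return copies


for position in (1_000, 22_000, 28_000, 30_000):
    print(f"nt {position}: covered by "
          f"{template_copies(position)} RNA species")
```

In this model coverage never decreases from 5' to 3', which is the
sense in which the subgenomic mRNAs are enriched in 3'-end sequence.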

The copy-choice mechanism of RNA recombination
predicts that recombination will occur more frequently
at RNA sites of strong secondary structure,
since these structures promote transcriptional
pausing. 44 Indeed, MHV recombination has been shown
to occur readily in the hypervariable region of the spike
protein gene, where frequent deletions occur, 45
suggesting that the same secondary structure causes both
deletion and recombination. However, this interpretation
may not be correct, because when recombination
was examined under nonselective conditions
(e.g. when intracellular RNA from cells infected with
two different viruses was examined by RT-PCR for the
presence of recombinant viral RNA molecules without
virus isolation), the cross-over sites in the recombinant
RNAs were found to be distributed almost
randomly. 46 Only after a few cycles of virus passage in
tissue culture did the pattern of 'hotspots' of RNA
recombination become apparent. This finding suggests
that the so-called 'hotspots' detected in most
coronavirus recombination studies may be the result
of virus selection and do not represent the actual
recombination hotspots. It is possible that recombinants
with certain chimeric proteins derived from two
different parental viruses have evolutionary advantages
and thus will predominate during the course of
virus growth, and other recombinant viruses will be
selected against. The emergence of select recombinants
was also seen in DI recombination, where
recombinant DI RNAs containing a longer ORF were
selected. 28,29 This interpretation may explain why
coronavirus DI RNAs can undergo nonhomologous
recombination, 25 but coronaviral genomic RNAs cannot,
i.e. because recombinant viruses generated by
nonhomologous recombination may not grow
competitively.
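
The argument that apparent 'hotspots' can arise from selection alone
can be sketched numerically. The toy model below starts with
cross-overs spread evenly over ten genome regions and applies a
relative growth advantage at each passage. The number of regions, the
fitness values, and the choice of the favored region are all
hypothetical; nothing here is fitted to data.

```python
def passage(frequencies, fitness):
    """One round of virus growth: weight each recombinant class by its
    relative fitness, then renormalize so the frequencies sum to 1."""
    weighted = [f * w for f, w in zip(frequencies, fitness)]
    total = sum(weighted)
    return [w / total for w in weighted]


N_REGIONS = 10

# Before selection: cross-over sites distributed evenly across regions.
initial = [1 / N_REGIONS] * N_REGIONS

# Hypothetical fitness: recombinants crossing over in region 7 grow
# twice as well as all the others.
fitness = [1.0] * N_REGIONS
fitness[7] = 2.0

history = [initial]
for _ in range(3):
    history.append(passage(history[-1], fitness))

for number, frequencies in enumerate(history):
    print(f"passage {number}: region 7 holds {frequencies[7]:.2f} "
          f"of recombinants")
```

Although every region is equally likely to host a cross-over in this
model, the favored class grows from one tenth to nearly half of the
population within three passages, so sampling after passage would
suggest a hotspot that the recombination process itself did not create.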

How are the acceptor RNA sites selected? Conceivably,
nascent RNA transcripts bind to homologous
sequences on a different RNA template because of
sequence complementarity, resulting in homologous
recombination. The difficult aspect of this scenario is
that the template RNAs and the nascent transcripts
are likely complexed with other RNAs or proteins;
therefore, they are not exposed. Furthermore, RNA
polymerases are not known to initiate RNA synthesis
from the internal regions of any RNA, except from
certain transcription or replication signals. Thus, how
the acceptor sites are selected and how RNA synthesis
resumes from those sites are theoretically difficult
issues. One possibility is that the polymerase-nascent
RNA complex recognizes certain RNA secondary
structures or RNA-protein complexes on the acceptor
molecule by RNA-protein or protein-protein
interactions rather than base-pairing. According to this
scenario, the nascent RNA-polymerase complex may
not bind to the homologous sites on the acceptor
RNA. This would explain the nonhomologous or aberrant
homologous recombination seen in many RNA
viruses. 13 It is clear that even in homologous
recombination, strict sequence complementarity at the
cross-over sites is not necessary. 46,47 Conceivably, once the
nascent RNA has joined the acceptor RNA, there is
additional processing of the transcript, such as
cleavage of the 3'-ends. The extent of cleavage may
determine the final cross-over sites. Such a 3'-cleavage
activity has been demonstrated in several types of
DNA-dependent RNA polymerases. 48,49 In other RNA
viruses, such as brome mosaic virus, the parental RNA
templates may be held together by secondary
structures (complementary sequences) to facilitate
recombination. 11 Such a case has not been demonstrated for
coronaviruses.

Does recombination occur during (+)- or
(-)-strand RNA synthesis? Since both (+)- and
(-)-strand RNA fragments that cannot replicate could
recombine with the viral RNA, 33 it stands to reason
that recombination can take place during the synthesis
of both strands. The efficiency of either strand in
recombination has not been determined and may
depend on the amount of the available template RNA.
Another unresolved issue is whether any particular
sequence would favor recombination, as shown for
other RNA viruses. 50,51


Conclusion 

Recombination is an important genetic mechanism
for coronaviruses. It probably provides a mechanism
for maintaining viral genomic stability, inasmuch as
the coronavirus RNA has an extremely large size
which renders it vulnerable to the accumulation of a
large number of errors during RNA replication. It also
provides a mechanism for the natural evolution of the
virus and DI RNAs. Several issues regarding
coronavirus recombination remain unresolved, particularly
concerning the mechanism of recombination,
e.g. what is the sequence requirement for recombination,
and what are the protein factors involved in
recombination? Recombination may also provide a useful
genetic tool for creating coronaviral mutants, which is
not yet feasible by conventional reverse genetics
methodology. Thus, coronavirus RNA recombination
is an important biological phenomenon for
coronaviruses and serves as an excellent model for viral RNA
recombination in general.